Context-Dependent Modeling in Continuous Speech Recognition Based on a Persian Phonetic Decision Tree

Authors

Seyed Hosein Shams

Seyed Mohammad Ahadi

Abstract

Context-dependent modeling is a well-known approach to increasing modeling accuracy in continuous speech recognition, and it is most commonly implemented with triphone models. However, the large number of such models causes problems in training, and robust parameter estimates are hard to obtain. One solution is parameter tying. In this paper, HMM state parameters are clustered, and the states assigned to each cluster are tied, which reduces the overall number of system parameters and makes training more robust. Two types of grouping were carried out: one based on the parameters of the final trained model set and their inter-model distances, and the other based on the training data and a decision tree. For the latter, a decision tree was designed from the acoustic properties of the Persian (Farsi) language and its phonetic similarities and differences. The results show that both approaches are useful. The second approach, however, has the added advantage of making it possible to estimate the parameters of unseen models.
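
The decision-tree grouping described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the phone labels, the two phonetic questions, the one-dimensional Gaussian statistics, and the function names are all hypothetical, chosen only to show how a question is selected by likelihood gain and how an unseen context is then mapped to a tied state.

```python
import math

# Hypothetical statistics for one HMM state of triphones that share a centre
# phone: left context -> (occupancy, mean, variance) of a 1-D Gaussian.
STATS = {
    "b": (40.0, 1.0, 0.5),
    "p": (35.0, 1.2, 0.6),
    "m": (50.0, 3.0, 0.4),
    "n": (45.0, 3.1, 0.5),
}

# Hypothetical phonetic questions about the left context.
QUESTIONS = {
    "L_Nasal": {"m", "n"},
    "L_Bilabial": {"b", "p", "m"},
}


def pooled_loglik(contexts):
    """Log-likelihood of the pooled data under a single shared Gaussian."""
    n = sum(STATS[c][0] for c in contexts)
    if n == 0:
        return 0.0
    mean = sum(STATS[c][0] * STATS[c][1] for c in contexts) / n
    second = sum(STATS[c][0] * (STATS[c][2] + STATS[c][1] ** 2) for c in contexts) / n
    var = second - mean ** 2
    return -0.5 * n * (math.log(2 * math.pi * var) + 1.0)


def best_question(contexts):
    """Return the question with the largest log-likelihood gain, and that gain."""
    base = pooled_loglik(contexts)
    gains = {}
    for name, members in QUESTIONS.items():
        yes = [c for c in contexts if c in members]
        no = [c for c in contexts if c not in members]
        if yes and no:  # skip questions that do not split the data
            gains[name] = pooled_loglik(yes) + pooled_loglik(no) - base
    name = max(gains, key=gains.get)
    return name, gains[name]


def tied_leaf(left_context, question):
    """Map any left context, seen or unseen, to a leaf of a one-question tree."""
    return "yes" if left_context in QUESTIONS[question] else "no"


question, gain = best_question(list(STATS))
print(question, round(gain, 2))
print("unseen left context 'd' ->", tied_leaf("d", question))
```

Because the leaf is chosen by answering phonetic questions rather than by looking up trained models, a context with no training data (here the hypothetical left context "d") still receives the parameters of a tied state. A grouping based only on distances between trained models has no such lookup for models that were never trained.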

Similar resources

Improved Bayesian Training for Context-Dependent Modeling in Continuous Persian Speech Recognition

Context-dependent modeling is a widely used technique for better phone modeling in continuous speech recognition. While different types of context-dependent models have been used, triphones have been known as the most effective ones. In this paper, a Maximum a Posteriori (MAP) estimation approach has been used to estimate the parameters of the untied triphone model set used in data-driven clust...

Context dependent tree based transforms for phonetic speech recognition

This paper presents a novel method for modeling phonetic context using linear context transforms. Initial investigations have shown the feasibility of synthesising context-dependent models from context-independent models through weighted interpolation of the peripheral states of a given hidden Markov model with its adjacent model. This idea can be further extended, to maximum likelihood estimat...

Modeling context-dependent phonetic units in a continuous speech recognition system for Mandarin Chinese

We study the problem of phonetic modeling for continuous Mandarin speech recognition by providing a systematic performance comparison for systems based on the following primitive speech units: syllable, demi-syllable (Initials and Finals), context-independent phones, left-or-right context-dependent phones (diphones), and left-and-right context-dependent phones (triphones). In our speaker-dependent con...

Decision Tree-Based Context Dependent Sublexical Units for Continuous Speech Recognition of Basque

This paper presents a new methodology, based on the classical decision trees, to get a suitable set of context dependent sublexical units for Basque Continuous Speech Recognition (CSR). The original method proposed by Bahl [1] was applied as the benchmark. Then two new features were added: a data massaging to emphasise the data and a fast and efficient Growing and Pruning algorithm for DT const...

Syllable structure based phonetic units for context-dependent continuous Thai speech recognition

The choice of phonetic units for a speech recognizer is a factor that greatly affects system performance. Phonetic units are normally defined according to the acoustic properties of speech. Nevertheless, when training data is limited, overly fine acoustic properties are ignored. Syllable structure is one of the properties usually ignored in English phonetic units due to a lot of possible onsets...

Journal: The Modares Journal of Electrical Engineering

Publisher: Tarbiat Modares University

ISSN 2228-527X

Volume 3

Issue 1, 2003